Beyond Identity Coreference: Contrasting Indicators of Textual Coherence in English and German

Authors

  • Kerstin Kunz
  • Ekaterina Lapshinova-Koltunski
  • José Manuel Martínez
Abstract

This paper focuses on the interaction of chains of coreference identity with other types of relations, comparing English and German data sets in terms of language, mode (written vs. spoken) and register. We first describe the types of coreference and the chain features analysed as indicators of textual coherence and topic continuity. After sketching the feature categories under analysis and the methods used for statistical evaluation, we present the findings from our analysis and interpret them in terms of the contrasts mentioned above. We also show that, for some registers, coreference types other than identity are of great importance.

Similar resources

Experiments on bridging across languages and genres

In this paper, we introduce a typology of bridging relations applicable to multiple languages and genres. After discussing our annotation guidelines, we describe annotation experiments on the German part of our parallel coreference corpus and show that our interannotator agreement results are reliable, considering both antecedent selection and relation assignment. In order to validate our theor...

Sociocultural Identity in TEFL Textbooks: A Systemic Functional Analysis

This study aimed at investigating shades of identity in TEFL textbooks. Most identity studies have focused on authors as knowledge producers. They have neglected authors' roles in constructing identity. Further, few scholars have considered disciplinary specific textbooks in their analyses of identity. Trying to bridge these gaps, we applied Halliday’s Systemic Functional Linguistics to investi...

A Tidy Data Model for Natural Language Processing using cleanNLP

The package cleanNLP provides a set of fast tools for converting a textual corpus into a set of normalized tables. The underlying natural language processing pipeline utilizes Stanford’s CoreNLP library, exposing a number of annotation tasks for text written in English, French, German, and Spanish. Annotators include tokenization, part of speech tagging, named entity recognition, entity linking...

Multilingual Coreference Resolution

In this paper we present a new, multilingual data-driven method for coreference resolution as implemented in the SWIZZLE system. The results obtained after training this system on a bilingual corpus of English and Romanian tagged texts outperformed coreference resolution in each of the individual languages. 1 Introduction The recent availability of large bilingual corpora has spawne...

Exploring Authorial Identity in terms of Voice Intensity and Subject-Positioning in the Argumentative Writings of Male and Female Iranian Advanced EFL Learners

Academic writing is not just about presenting a set of ideas, but through the act of writing, the authors position themselves as individuals having particular identities which mostly reflect the dominant sociocultural values and practices of the discourse communities in which they are living and performing. The present study, using a mixed method approach, attempted to explore the evidences of ...


Publication date: 2016